AITopics | sufficient information

sufficient information

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

CaRT: Teaching LLM Agents to Know When They Know Enough

Liu, Grace, Qu, Yuxiao, Schneider, Jeff, Singh, Aarti, Kumar, Aviral

arXiv.org Artificial Intelligence, Oct-10-2025

Many tasks require learned models to strategically gather relevant information over multiple rounds of interaction before actually acting on a task. Strategic information gathering requires models to know not only how to effectively acquire information, but also when to stop gathering information and make a decision, in order to avoid overthinking or getting derailed when acting. In this paper, we formalize this problem and introduce Counterfactuals and Reasoning for Termination (CaRT), an approach for teaching LLMs when to stop seeking information. To appropriately learn when to terminate, CaRT fine-tunes LLMs using counterfactual pairs of trajectories, one where termination is appropriate and a minimally modified version of the same trajectory where it is not. It trains the LLM to explain the rationale for the termination decision in either case via verbal reasoning, and imbues this capability into the base LLM via fine-tuning. We instantiate CaRT in two domains: interactive medical diagnosis and math problem solving. In both domains, we find that CaRT improves the efficiency of information gathering and task success rate compared to other fine-tuning methods.

information, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2510.08517

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Diagnostic Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Haunted House: A text-based game for comparing the flexibility of mental models in humans and LLMs

Puppart, Brett, Paltmann, Paul-Henry, Aru, Jaan

arXiv.org Artificial Intelligence, Feb-12-2025

The advent of transformer-based large language models (LLMs) has reignited the philosophical debate of human significance - a question that has persisted for millennia. Aristotle thought the function of humans was to live according to the rational principle, which was something that distinguished us from other animals (Aristotle, 2014). Back then, this might have seemed like a reasonable conclusion, as humans use complex language and abstract thinking to a degree that other animals simply do not. However, recent advancements in artificial intelligence (AI) are shining light on the possibility that in the future we might be living in a world in which our creation is more intelligent than us - or perhaps that this world is already here. In many benchmarks comparing humans and AI, LLMs have shown a trend of rapid increase in performance. In SimpleBench, which measures common sense reasoning and social intelligence, GPT-4o scored only 17.8% and o1-preview 41.7% (Philip & Hemang, 2024).

instruction, llm, participant, (17 more...)

arXiv.org Artificial Intelligence

2503.16437

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Estonia > Tartu County > Tartu (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Education (1.00)
Leisure & Entertainment > Games (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Can foundation models actively gather information in interactive environments to test hypotheses?

Ke, Nan Rosemary, Sawyer, Danny P., Soyer, Hubert, Engelcke, Martin, Reichert, David P, Hudson, Drew A., Reid, John, Lerchner, Alexander, Rezende, Danilo Jimenez, Lillicrap, Timothy P, Mozer, Michael, Wang, Jane X

arXiv.org Machine Learning, Dec-9-2024

While problem solving is a standard evaluation task for foundation models, a crucial component of problem solving -- actively and strategically gathering information to test hypotheses -- has not been closely investigated. To assess the information gathering abilities of foundation models in interactive environments, we introduce a framework in which a model must determine the factors influencing a hidden reward function by iteratively reasoning about its previously gathered information and proposing its next exploratory action to maximize information gain at each step. We implement this framework in both a text-based environment, which offers a tightly controlled setting and enables high-throughput parameter sweeps, and in an embodied 3D environment, which requires addressing complexities of multi-modal interaction more relevant to real-world applications. We further investigate whether approaches such as self-correction and increased inference time improve information gathering efficiency. In a relatively simple task that requires identifying a single rewarding feature, we find that the LLM's information gathering capability is close to optimal. However, when the model must identify a conjunction of rewarding features, performance is suboptimal. The hit in performance is due partly to the model translating the task description to a policy and partly to the model's effectiveness in using its in-context memory. Performance is comparable in both text and 3D embodied environments, although imperfect visual object recognition reduces its accuracy in drawing conclusions from gathered information in the 3D embodied case. For single-feature-based rewards, we find that smaller models curiously perform better; for conjunction-based rewards, incorporating self-correction into the model improves performance.

information, large language model, machine learning, (18 more...)

arXiv.org Machine Learning

2412.06438

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Teaching Smaller Language Models To Generalise To Unseen Compositional Questions (Full Thesis)

Hartill, Tim

arXiv.org Artificial Intelligence, Nov-25-2024

Pretrained Large Language Models (LLMs) are able to answer questions that are unlikely to have been encountered during training. However, a diversity of potential applications exists in the broad domain of reasoning systems, and considerations such as latency, cost, available compute resource and internet connectivity are relevant in determining an appropriate approach. We consider the setting where some local compute capacity is available at inference time but internet connectivity is not. Similar to a general-purpose LLM, we assume that our much smaller Reasoning Models may be asked arbitrary questions from unknown distributions, so we focus on evaluation in an unseen setting. We train our models to answer diverse questions by instilling an ability to reason over a retrieved context. We acquire context from two knowledge sources: a Wikipedia corpus queried using a multi-hop dense retrieval system with novel extensions, and rationales generated from a larger Language Model optimised to run in a lower resource environment. Our main contributions: We propose novel methods to show that our model is capable of answering contextualised questions without memorisation. We establish a comprehensive set of baseline results on unseen evaluation datasets. We show that the addition of novel retrieval-augmented training datasets (RATD) to the training regime of the Reasoning Model significantly improves results. We demonstrate further significant improvement through the application of methods for combining knowledge from two sources. The first method (RR) involves training a novel Rationale Ranking model to score both generated rationales and retrieved contexts with respect to relevance and truthfulness. We use the scores to derive combined contexts. We also show that utilising the RATD datasets enables our model to become proficient at utilising combined noisy contexts.

artificial intelligence, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2411.16985

Country:

Asia > Middle East > Iraq (0.14)
North America > United States > Texas > Harris County > Houston (0.13)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.13)
(25 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Leisure & Entertainment > Sports (1.00)
Energy (1.00)
Health & Medicine > Therapeutic Area (0.67)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

BEAVER: An Enterprise Benchmark for Text-to-SQL

Chen, Peter Baile, Wenz, Fabian, Zhang, Yi, Kayali, Moe, Tatbul, Nesime, Cafarella, Michael, Demiralp, Çağatay, Stonebraker, Michael

arXiv.org Artificial Intelligence, Sep-3-2024

Existing text-to-SQL benchmarks have largely been constructed using publicly available tables from the web with human-generated tests containing question and SQL statement pairs. They typically show very good results and lead people to think that LLMs are effective at text-to-SQL tasks. In this paper, we apply off-the-shelf LLMs to a benchmark containing enterprise data warehouse data. In this environment, LLMs perform poorly, even when standard prompt engineering and RAG techniques are utilized. As we will show, the poor performance is largely due to three characteristics: (1) public LLMs cannot train on enterprise data warehouses because they are largely in the "dark web", (2) schemas of enterprise tables are more complex than the schemas in public data, which makes the SQL-generation task innately harder, and (3) business-oriented questions are often more complex, requiring joins over multiple tables and aggregations. As a result, we propose a new dataset BEAVER, sourced from real enterprise data warehouses together with natural language queries and their correct SQL statements which we collected from actual user history. We evaluated this dataset using recent LLMs and demonstrated their poor performance on this task. We hope this dataset will facilitate future researchers building more sophisticated text-to-SQL systems which can do better on this important class of data.

dataset, user question, varchar2, (16 more...)

arXiv.org Artificial Intelligence

2409.02038

Country:

Asia > Middle East > UAE (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(2 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.69)

Industry: Education > Educational Setting (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Information Compression in Dynamic Games

Tang, Dengwang, Subramanian, Vijay, Teneketzis, Demosthenis

arXiv.org Artificial Intelligence, Jul-17-2024

One of the reasons why stochastic dynamic games with an underlying dynamic system are challenging is that strategic players have access to an enormous amount of information, which leads to the use of extremely complex strategies at equilibrium. One approach to resolve this challenge is to simplify players' strategies by identifying appropriate compression of information maps so that the players can make decisions solely based on the compressed version of information, called the information state. For finite dynamic games with asymmetric information, inspired by the notion of information state for single-agent control problems, we propose two notions of information states, namely mutually sufficient information (MSI) and unilaterally sufficient information (USI). Both these information states are obtained with information compression maps independent of the strategy profile. We show that Bayes-Nash Equilibria (BNE) and Sequential Equilibria (SE) exist when all players use MSI-based strategies. We prove that when all players employ USI-based strategies the resulting sets of BNE and SE payoff profiles are the same as the sets of BNE and SE payoff profiles resulting when all players use full information-based strategies. We prove that when all players use USI-based strategies the resulting set of weak Perfect Bayesian Equilibrium (wPBE) payoff profiles can be a proper subset of all wPBE payoff profiles. We identify MSI and USI in specific models of dynamic games in the literature. We end by presenting an open problem: Do there exist strategy-dependent information compression maps that guarantee the existence of at least one equilibrium or maintain all equilibria that exist under perfect recall? We show, by a counterexample, that a well-known strategy-dependent information compression map used in the literature does not possess any of the properties of MSI or USI.

information, strategy profile, sufficient information, (14 more...)

arXiv.org Artificial Intelligence

2407.12318

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.27)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)

Decision Theoretic Foundations for Experiments Evaluating Human Decisions

Hullman, Jessica, Kale, Alex, Hartline, Jason

arXiv.org Artificial Intelligence, Jan-25-2024

Decision-making with information displays is a key focus of research in areas like explainable AI, human-AI teaming, and data visualization. However, what constitutes a decision problem, and what is required for an experiment to be capable of concluding that human decisions are flawed in some way, remain open to speculation. We present a widely applicable definition of a decision problem synthesized from statistical decision theory and information economics. We argue that to attribute loss in human performance to forms of bias, an experiment must provide participants with the information that a rational agent would need to identify the normative decision. We evaluate the extent to which recent evaluations of decision-making from the literature on AI-assisted decisions meet this criterion. We find that only 6 (17%) of 35 studies that claim to identify biased behavior present participants with sufficient information to characterize their behavior as deviating from good decision-making. We motivate the value of studying well-defined decision problems by describing a characterization of performance losses they allow us to conceive. In contrast, the ambiguities of a poorly communicated decision problem preclude normative interpretation. We conclude with recommendations for practice.

decision problem, information, participant, (14 more...)

arXiv.org Artificial Intelligence

2401.15106

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Illinois > Cook County > Evanston (0.04)
Asia > Malaysia (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (0.46)
Government > Voting & Elections (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(4 more...)

Incorporating sufficient physical information into artificial neural networks: a guaranteed improvement via physics-based Rao-Blackwellization

Geuken, Gian-Luca, Mosler, Jörn, Kurzeja, Patrick

arXiv.org Artificial Intelligence, Nov-10-2023

The concept of Rao-Blackwellization is employed to improve predictions of artificial neural networks by physical information. The error norm and the proof of improvement are transferred from the original statistical concept to a deterministic one, using sufficient information on physics-based conditions. The proposed strategy is applied to material modeling and illustrated by examples of the identification of a yield function, elasto-plastic steel simulations, the identification of driving forces for quasi-brittle damage and rubber experiments. Sufficient physical information is employed, e.g., in the form of invariants, parameters of a minimization problem, dimensional analysis, isotropy and differentiability. It is proven how intuitive accretion of information can yield improvement if it is physically sufficient, but also how insufficient or superfluous information can cause impairment. Opportunities for the improvement of artificial neural networks are explored in terms of the training data set, the networks' structure and output filters. Even crude initial predictions are remarkably improved by reducing noise, overfitting and data requirements.
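For orientation, the original statistical result that the abstract refers to can be stated as below. This is the classical Rao-Blackwell inequality for squared error, not the paper's deterministic, physics-based version, which is not reproduced here. The symbols are assumptions introduced for illustration: $\delta(X)$ is an estimator of a parameter $\theta$ and $T(X)$ is a sufficient statistic.

```latex
\[
  \delta^{*}(X) = \mathbb{E}\!\left[\,\delta(X) \mid T(X)\,\right],
  \qquad
  \mathbb{E}\!\left[\bigl(\delta^{*}(X) - \theta\bigr)^{2}\right]
  \;\le\;
  \mathbb{E}\!\left[\bigl(\delta(X) - \theta\bigr)^{2}\right].
\]
```

Sufficiency of $T$ is what makes $\delta^{*}$ computable without knowing $\theta$; conditioning on a statistic that is not sufficient does not in general yield a usable estimator.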

artificial intelligence, information, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2311.06147

Country: Europe > Germany (0.14)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Published rANS patent by Storeleap - Page 5

#artificialintelligence, Mar-7-2021, 08:35:54 GMT

Features of range asymmetric number system encoding and decoding. Abstract: Innovations in range asymmetric number system ("RANS") coding and decoding are described herein. Some of the innovations relate to hardware implementations of RANS decoding that organize operations in two phases, which can improve the computational efficiency of RANS decoding. Other innovations relate to adapting RANS encoding/decoding for different distributions or patterns of values for symbols. For example, RANS encoding/decoding can adapt by switching a default symbol width (the number of bits per symbol), adjusting symbol width on a fragment-by-fragment basis for different fragments of symbols, switching between different static probability models on a fragment-by-fragment basis for different fragments of symbols, and/or selectively flushing (or retaining) the state of a RANS decoder on a fragment-by-fragment basis for different fragments of symbols. In many cases, such innovations can improve compression efficiency while also providing computationally efficient performance.
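As background for the abstract above, the basic rANS state update can be sketched in a few lines. This is a textbook-style illustration under simplifying assumptions, not the patented design: it uses one static frequency table, keeps the whole state in a single arbitrary-precision integer (so there is no renormalization, no fragments, no adaptive symbol width, and no two-phase hardware decoding), and the function names `build_tables`, `rans_encode`, and `rans_decode` are hypothetical.

```python
def build_tables(freqs):
    """Return (start offsets, total) for a dict of symbol -> positive integer frequency."""
    starts, total = {}, 0
    for sym, f in freqs.items():
        starts[sym] = total
        total += f
    return starts, total


def rans_encode(symbols, freqs, state=1):
    """Fold all symbols into one integer state.

    Symbols are processed in reverse so that decoding returns them in order.
    """
    starts, total = build_tables(freqs)
    for sym in reversed(symbols):
        f = freqs[sym]
        state = (state // f) * total + starts[sym] + (state % f)
    return state


def rans_decode(state, count, freqs):
    """Recover `count` symbols from the state; also return the leftover state."""
    starts, total = build_tables(freqs)
    out = []
    for _ in range(count):
        slot = state % total
        # Linear search for the symbol whose frequency range contains the slot.
        sym = next(s for s in freqs if starts[s] <= slot < starts[s] + freqs[s])
        state = freqs[sym] * (state // total) + slot - starts[sym]
        out.append(sym)
    return out, state


if __name__ == "__main__":
    freqs = {"a": 3, "b": 1}          # assumed static model: P(a)=3/4, P(b)=1/4
    message = list("abaab")
    code = rans_encode(message, freqs)
    decoded, final_state = rans_decode(code, len(message), freqs)
    print(code, "".join(decoded), final_state)
```

Practical coders bound the state to a fixed width and stream bits or bytes out as it grows; the adaptations listed in the abstract operate on top of that machinery.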

iteration, output symbol, ran decoder, (15 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence (0.43)

The Prevalence of Errors in Machine Learning Experiments

Shepperd, Martin, Guo, Yuchen, Li, Ning, Arzoky, Mahir, Capiluppi, Andrea, Counsell, Steve, Destefanis, Giuseppe, Swift, Stephen, Tucker, Allan, Yousefi, Leila

arXiv.org Artificial Intelligence, Sep-10-2019

Context: Conducting experiments is central to machine learning research to benchmark, evaluate and compare learning algorithms. Consequently it is important we conduct reliable, trustworthy experiments. Objective: We investigate the incidence of errors in a sample of machine learning experiments in the domain of software defect prediction. Our focus is simple arithmetical and statistical errors. Method: We analyse 49 papers describing 2456 individual experimental results from a previously undertaken systematic review comparing supervised and unsupervised defect prediction classifiers. We extract the confusion matrices and test for relevant constraints, e.g., the marginal probabilities must sum to one. We also check for multiple statistical significance testing errors. Results: We find that a total of 22 out of 49 papers contain demonstrable errors. Of these 7 were statistical and 16 related to confusion matrix inconsistency (one paper contained both classes of error). Conclusions: Whilst some errors may be of a relatively trivial nature, e.g., transcription errors, their presence does not engender confidence. We strongly urge researchers to follow open science principles so errors can be more easily detected and corrected, and thus, as a community, reduce this worryingly high error rate with our computational experiments.
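The kind of constraint test the abstract describes can be sketched as follows. This is a minimal illustration and not the authors' tooling: the function name `check_result`, its parameters, and the tolerance are hypothetical, and it assumes the confusion matrix is reported as proportions alongside precision and recall. (The abstract's own counts are consistent: 7 + 16 - 1 = 22 papers.)

```python
import math


def check_result(tp, fp, fn, tn, reported_precision=None,
                 reported_recall=None, tol=0.01):
    """Return a list of inconsistencies found in one reported result.

    tp, fp, fn, tn are confusion-matrix cells given as proportions.
    """
    problems = []
    if min(tp, fp, fn, tn) < 0:
        problems.append("negative cell value")

    # The four cell proportions must sum to one.
    total = tp + fp + fn + tn
    if not math.isclose(total, 1.0, abs_tol=tol):
        problems.append(f"cells sum to {total:.3f}, not 1")

    # Recompute derived metrics and compare with what the paper reports.
    if reported_precision is not None and tp + fp > 0:
        precision = tp / (tp + fp)
        if abs(precision - reported_precision) > tol:
            problems.append(
                f"precision recomputed as {precision:.3f}, "
                f"reported {reported_precision:.3f}")
    if reported_recall is not None and tp + fn > 0:
        recall = tp / (tp + fn)
        if abs(recall - reported_recall) > tol:
            problems.append(
                f"recall recomputed as {recall:.3f}, "
                f"reported {reported_recall:.3f}")
    return problems


if __name__ == "__main__":
    # A consistent result, then one whose cells do not sum to one.
    print(check_result(0.2, 0.1, 0.1, 0.6, 0.667, 0.667))
    print(check_result(0.2, 0.1, 0.1, 0.5))
```

The tolerance allows for rounding in published tables; a stricter check would derive the admissible range from the number of decimal places reported.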

artificial intelligence, experiment, machine learning, (13 more...)

arXiv.org Artificial Intelligence

1909.04436

Country:

Asia > China (0.14)
Europe (0.14)

Genre: Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
